Abstract

Background: Stx bacteriophages are responsible for driving the dissemination of Stx toxin genes (stx) across their 
bacterial host range. Lysogens carrying Stx phages can cause severe, life-threatening disease and Stx toxin is an 
integral virulence factor. The Stx-bacteriophage vB_EcoP-24b, commonly referred to as Φ24B, is capable of multiply
infecting a single bacterial host cell at a high frequency, with secondary infection increasing the rate at which
subsequent bacteriophage infections can occur. This is biologically unusual; therefore, determining the genomic
content and context of Φ24B compared to other lambdoid Stx phages is important to understanding the factors
controlling this phenomenon and determining whether they occur in other Stx phages.

Results: The genome of the Stx2 encoding phage, Φ24B, was sequenced and annotated. The genomic organisation
and general features are similar to other sequenced Stx bacteriophages induced from Enterohaemorrhagic
Escherichia coli (EHEC); however, Φ24B possesses significant regions of heterogeneity, with implications for phage
biology and behaviour. The Φ24B genome was compared to other sequenced Stx phages and the archetypal
lambdoid phage, lambda, using the Circos genome comparison tool and a PCR-based multi-loci comparison
system.

Conclusions: The data support the hypothesis that Stx phages are mosaic, and recombination events between the 
host, phages and their remnants within the same infected bacterial cell will continue to drive the evolution of Stx 
phage variants and the subsequent dissemination of shigatoxigenic potential. 



Background 

Shiga toxin encoding bacteriophages (Stx phages) are re- 
sponsible for converting the pathogenic profiles of their 
bacterial hosts. Enterohaemorrhagic Escherichia coli 
(EHEC), a subset of the Shigatoxigenic E. coli (STEC),
differentiated by their ability to produce attachment and 
effacement lesions, emerged as a serious food borne 
threat to humans in the 1980s [1-3]. The emergence of 
this group of organisms was due to an Stx phage infec- 
tion of a mildly pathogenic progenitor strain [4]. The se- 
vere disease (bloody diarrhoea and haemorrhagic colitis) 
and disease sequelae (haemolytic uraemic syndrome 
[HUS] and thrombotic thrombocytopenic purpura 
[TTP]) caused by EHEC are all linked to the activity of 
the Shiga toxin (Stx) [5], the expression of which is 
genetically coordinated by the lytic replication cycle of
Stx phage [6]. Although the global incidence of EHEC
infection is low, severe disease and death occur in an
unacceptably high proportion of infected individuals [7]:
10% and 3-5%, respectively [8].

Stx phages are lambdoid bacteriophages, sharing the 
distinct genome organisation of the archetypal bacteriophage
lambda (λ) [5]. They possess two replication strategies:
lysogenic, where the phage genome directs its
integration into the bacterial host genome as a prophage; 
or lytic, where viral progeny are assembled intracellularly 
and released by lysis of the host cell through the action 
of phage encoded lysozyme, holin and pinholin proteins 
[1,9,10]. Production of Stx in the lysogen is linked to the 
latter, and the release of Stx from the lysogen predomin- 
antly coincides with induction of the lytic cycle and bac- 
terial host cell lysis [6]. 

Bacterial genome sequencing projects have highlighted 
the impact that temperate phages have upon bacterial 
evolution, and those that impact directly on the pathogenicity
of the host bacterium are known as converting
phage. In addition to stx genes carried by Stx phages
and expressed by E. coli, other examples of converting
phage include the CTX phage encoding the cholera
toxin genes expressed by Vibrio cholerae [11] and lom
and bor of bacteriophage lambda, which affect E. coli adherence
to human buccal epithelial cells [12] and sensitivity
to serum killing [13], respectively. It can be
postulated that the maintenance of converting phage in 
a lysogen is due to positive selection pressure for pro- 
phage carriage by the host cell in an animal host. 

The bacteriophage vB_EcoP-24b [14], carrying the 
Shiga toxin 2 variant (Stx2) [5] (hereafter referred to as
Φ24B), has been well characterised [15-21] since its initial
purification following induction from a clinical isolate of
E. coli O157:H7 [22]. Φ24B infects rough and smooth
strains of E. coli [18] and can adsorb to many members
of the Enterobacteriaceae, including Salmonella spp.
[18]. The adsorption target for this phage is an essential 
outer membrane protein, BamA, which is involved in 
the biogenesis of the Gram negative bacterial outer 
membrane and is not only highly conserved across 
members of the Enterobacteriaceae, but also conserved
to some degree in all Gram negative bacteria [20]. 
Using a Stx phage multi loci gene typing system [21], it 
was demonstrated that >70% of Stx phages share a 
gene responsible for the short-tailed phage morphotype 
that enables adsorption to BamA [20]. Φ24B also has the
ability to multiply infect a single host cell and integrate
into different sites across the E. coli chromosome
[16,17,22], a behaviour which departs from the lambda 
phage immunity dogma [15]. This could act to not only 
increase the pathogenic profile of the host with each 
subsequent infection [23], but also enable recombin- 
ation events between resident inducible and cryptic pro- 
phages, promoting the production and release of novel 
recombinant phage mosaics. 

The objectives of this study were to sequence the genome
of Φ24B and apply comparative genomic analyses to
highlight important genetic similarities and differences 
across the Stx phages sequenced to date. The ultimate 
aim is to identify potential effectors controlling the biol- 
ogy of these phages and the expression of genes that 
provide a selective advantage to either the bacterial lyso- 
gen or to the phages themselves. 

Results and discussion 

Genome annotation 

Phage genes are usually small in size (< 1 kb), and very 
few of them have been subjected to detailed biochemical/ 
functional characterisation, which makes the definitive an- 
notation of phage genomes challenging. Notwithstanding 
the difficulties inherent in the production of informative
phage genome annotation, the sequencing and subsequent
annotation of the Φ24B genome are reported here
[HM208303]. Its genomic organisation confirms that Φ24B
is a lambdoid phage sharing similar overall genetic context
with bacteriophage lambda (Figure 1). Annotation of the
57,677 bp genome revealed 88 putative coding regions
(CDS), including the stx2AB genes, of which the B gene is
not annotated due to allelic replacement with the chloramphenicol
resistance gene from pLysS [Novagen]. These comprised
26 CDS (30%) that shared a high level of
sequence similarity with those of known function in other
lambdoid phages (with or without stx genes); three CDS
(vb_24B 2c, 4c and 25c), which have never been identified
previously; eleven CDS sharing some, but not complete,
homology to those genes with poorly defined roles in
lambdoid phage biology; and 48 CDS (55%) encoding proteins of
unknown function that are nevertheless found in association
with other lambdoid phages (Additional file 1: Table S1).
A comparison of the number of genes encoding proteins 
of undetermined function in Stx phages and the number 
of hypothetical proteins encoded by sequenced E. coli iso- 
lates (Figure 2), demonstrates that Stx phages carry a 
greater percentage of hypothetical genes than their E. coli 
hosts, 55% versus 24%, respectively (Figure 2), especially
remarkable considering the size differential between the 
bacteriophage and bacterial genomes, but not an uncommon
occurrence in bacteriophage genomes [24]. An analysis
of the annotated Φ24B genome with CGView [25,26]
(Figure 1) shows that hypothetical genes are particularly
common in the late gene region of the phage; downstream
of the antiterminator Q, 44 Φ24B genes were annotated
(both strands) of which 32 (73%) are designated as hypothetical.
Because of their location in the late gene region,
their expression is likely to be linked to prophage induc- 
tion/phage replication unless they are morons (horizon- 
tally acquired genes with no function for the phage, but 
usually beneficial to the bacterial host), uncoupled from 
the standard regulatory networks [27]. Expression analysis
of these 32 genes is necessary to determine if they have
been carried along via in situ recombination events with- 
out impacting the bacterial host or phage replication 
machinery or if they have been retained in the genome 
under their own expression control (or linked to other 
regulatory networks) because they benefit the bacterial 
host or phage replication. 
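
The proportions quoted in this section can be re-derived from the CDS counts given in the text. The short Python sketch below is illustrative only: the category labels are paraphrased from the annotation summary, and all variable and function names are assumptions introduced here rather than part of the annotation workflow.

```python
# CDS categories and counts as reported in the annotation summary above.
# The dictionary keys are paraphrased labels, introduced for illustration.
cds_counts = {
    "known function in other lambdoid phages": 26,
    "not previously identified": 3,
    "partial homology, poorly defined role": 11,
    "unknown function, seen in other lambdoid phages": 48,
}

total_cds = sum(cds_counts.values())


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


for category, count in cds_counts.items():
    print(f"{category}: {count} CDS ({percent(count, total_cds):.1f}%)")

# Late gene region: 44 annotated genes downstream of Q, 32 of them hypothetical.
late_region_share = percent(32, 44)
print(f"hypothetical genes downstream of Q: {late_region_share:.1f}%")
```

The four categories sum to the 88 CDS reported, and the rounded shares agree with the 30%, 55% and 73% figures given in the text.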

Two unexpectedly large genes were identified in the 
Φ24B genome sequence. The first of these large genes,
vb_24B 48, is predicted to encode a protein of 2,808 aa 
and is located close to the right end of the genome 
(Figure 1). This gene is also carried by other Stx phages 
including 933W, VT2-Sa, Stx2 II, Stx2 converting bac- 
teriophage 86 and Min27. Gene vb_24B 48 homologues 
have also been identified within bacterial genomes carry- 
ing non-Stx prophages, e.g. Salmonella enterica subsp. 
Figure 1 CGView-derived schematic of the Φ24B genome; the concentric rings include the annotation, location and direction of
expression. Genes that are detailed in the centre of the genome and suffixed with a 'c' are expressed from the complementary strand. The
internal concentric rings indicate +/- GC skew and GC content.



enterica serovar Kentucky isolate (ZP_0258689) encodes 
a gene sharing 1128 of the 1611 amino acid residues. 
The predicted protein of vb_24B 48 has no easily assignable
function, but does possess a partial COG 1483 domain
(associated with the AAA+ superfamily of ATPases
by general function prediction) between residues 345
and 1176; SignalP analysis [28] indicates that the first 15
nucleotides might function as a leader peptide. The protein
encoded by vb_24B 48 has no homology with any
protein subjected to conventional functional analysis, 
but TMPred [29,30] predicts that the protein possesses 
membrane-spanning domains. This protein has many of 
the characteristics of the giant genes that typically encode
surface proteins involved in bacterial fitness [31],
and this could be relevant to its conservation among Stx
phages. The second large gene, P (2906 bp), encodes the
polymerase for Φ24B replication. PΦ24B possesses a number
of well characterised and conserved domains, including
an intact TOPRIM_primase domain (cd01029) at the
amino terminus and an intact P-loop NTPase superfamily
domain (cl09099) at the carboxyl terminus, specifically
harbouring the GP4d_helicase domain (cd01122).
An orthologue of PΦ24B has been found in association
with a Shigella flexneri prophage (YP_690085.1, sharing
955 of the 968 amino acid residues), and in Stx phage
Min27 (YP_001648921.1) with an amino acid identity of
87%. PΦ24B carries an intein [32,33], interrupting amino
acid residues 372-702, and includes an intact HintN
Figure 2 Percentage levels of unknown/hypothetical genes with no inferred function from all sequenced and annotated E. coli strains
and Stx phages available on GenBank. The open box encompasses the current levels of annotated genes with no inferred function in Stx
phage genomes.



domain (cl12032), specifically of the smart00306 superfamily.
Comparison of the CDS sequence excluding the
intein shows that PΦ24B shares significant identity with a
number of prophage or bacteriophage encoded proteins
including ZP_0795005.1 and ZP_04535347.1 associated
with an unclassified member of the Enterobacteriaceae
(NZ_ADCU00000000), and an unidentified Escherichia
isolate (NZ_DS999462.1), respectively. It is very likely
that the activation of the intein will play a role in the 
post-translational regulation of the replication protein, 
but this, as well as the basic function of the intein, has 
yet to be experimentally determined. 

Φ24B also harbours the two accessory genes, lom and
bor [5], that in bacteriophage lambda are not involved in 
phage replication, but do affect the fitness of lambda 
lysogens in mammalian hosts [13,34] and are expected 
to play similar fitness roles in this Stx phage and other 
Stx phages that carry these genes. 

Genome comparisons 

Φ24B was compared to eleven previously sequenced Stx
phages [35-43] (though Stx2 bacteriophage 86 [AB255436]
is unpublished) and bacteriophage lambda [39]. The analysis
presented in Figure 3 highlights the mosaic nature of
these lambdoid phages. The most similar Stx phages to
Φ24B are Min27 [36], 933W [41], VT2-Sakai [42], and the
Stx2 converting phages I and II [35], which like Φ24B all
possess a Podoviridae-like morphology. These phages represent
a global collection of Stx phages associated with incidents
of human STEC infection, e.g.
933W (US), Sakai and all Stx2 converting phages I, II and
86 (Japan), Stx2 converting phage 1717 (Canada), Φ24B
(UK) and Min27 (China). These phages all share regions of
homology with one another, but the degree of shared identity
differs between phages, and no two phages are identical.

There is evidence that most circulating Stx phages are 
short-tailed Podoviridae [5,20,21,44,45], which have 
evolved an almost perfect infection strategy utilising an 
essential, highly conserved, outer membrane protein 
BamA (previously YaeT) for host cell recognition and 
adsorption [20]. This essential adsorption target, the fact 
that many outbreak strains carry more than one Stx 
phage [46,47], and the capacity of at least some Stx 
phages to multiply infect a single host cell [15-17,22,48] 
are likely to foster many opportunities to drive phage evolution
through in situ recombination events. Thus the
similarities in genome content across the short-tailed 
phages depicted in Figure 3, excluding lambda and Phi 
27 that lie outside this group, may be a consequence of 
such recombination [49]. 

Genomic comparison has also shown that although 
many of the genes carried by Stx phages encode hypothet- 
ical proteins, there are recognisable accessory genes with 
activities that have been characterised in other systems, e.g.
exo, gam, bet, lar, lom, bor and stk. The genes exo, gam and
bet are the three components of the lambda-encoded Red 
recombinase system [50]. The products of these genes 
increase DNA recombination rates, which is likely to drive 
the creation of novel phages and extend bacterial host 
ranges through in situ recombination events between 

Figure 3 Multi-genome comparison of all sequenced Stx phages, the archetypal lambdoid phage Lambda, and Φ24B. A. Circos map
depicting the MUMmer alignment results with respect to Stx2 phage Φ24B. Each coloured segment represents a phage genome with the numbers
on the external surface indicating genome size in kb. Inside the genome ring are hatch marks indicating gene locations and their respective
coding strands. The inner circle is composed of coloured blocks that are indicative of gene conservation with Φ24B. The coloured swept arcs
indicate sequence conservation and orientation of those sequences with respect to Φ24B. B. A multi-loci comparison [21]. Loci corresponding
to the genome annotation that have been marked are loci that have been used in previous multi-loci typing of Stx bacteriophages or are
defined in Additional file 1: Table S1.



resident inducible and cryptic prophages, as well as infect- 
ing phages in the bacterial lysogen [5]. The gene lar 
encodes a protein involved in the alleviation of restriction 
systems [51], which are often used by bacteria as a primary 
defense against phage infection [52]. The genes lom and
bor encode products that enhance the lysogen's ability to
colonise its host [13,34], and stk encodes a kinase with an
as yet unidentified impact on the lysogen or the lysogen's
host [53], but it is clear that stk expression is controlled by
the pRM promoter, and its expression occurs only under 
conditions of stable lysogeny [54]. 

The genes associated with the genetic switch, control- 
ling the behaviour of these phages and their decision to 
enter the lysogenic or lytic replicative cycles (e.g. cI, Q
and N), are present across all lambdoid phages, though
distinct sequence variants are known (Figure 3). A PCR- 
based multilocus characterisation system developed for 
Stx phages [21] was applied to the 11 sequenced Stx 
phages and lambda (Figure 3B). The integrase gene of 
Φ24B [16,17] is also carried by the Stx2 converting
phages 86 and 1717. All three phages possess the int
genes in a genomic orientation opposite to the lambda
phage integrase gene. The Φ24B-like integrase gene is
under the control of its own promoter region [55] in all
three phages, from where it is likely to drive high frequency
superinfection events [17]. The Φ24B cIII gene is
not present in P27, but in the other phages it is well
conserved, sharing at least 99% aa identity. The antiterminator,
N, involved in early gene expression, is present
in one of three forms in all but phage P27. N1 [21] is
present in Φ24B, 933W, Stx2 converting phage I, Min27
and BP-4795, all sharing at least 98% identity, and N2
[21] is carried by VT2-Sa, Stx2 phage 1717 and YYZ-2008,
whilst Lambda possesses a third variant (Additional
file 1: Table S1); the three variants can share as little as
22% sequence identity. The cI gene product, the regulator
controlling maintenance of lysogeny through repression
of the lytic life cycle, was identified in five
variant forms. The repressor genes of Stx phages 933W,
Min27, Stx2 converting phage I and Stx2 converting phage 86
are all of the cI2b variant, while BP-4795 possesses cI2a′, which
shares 69% overall identity with the cI2a protein and
100% identity at the carboxy terminal half. Sequence
and structure/function predictions indicate that the
altered amino terminus is likely to have different DNA
binding properties, whilst retaining similar dimerization
properties that are key to its function [56]. The cI2c
genes from VT2-Sa, Stx2 phage II, YYZ-2008 and Stx2
phage 1717 all share sequence identity across the entire
coding region of the cI gene, though they are currently
annotated with different amino termini. The VT2-Sa cI
gene amplifies with the cI2c primers, but a single nucleotide
polymorphism has introduced a stop codon
and thus ablates 60 amino acids from the amino
terminus, probably destroying the ability of this repressor
protein to bind DNA; this may, at least partly, explain
the non-inducible nature of this prophage [57]. The archetypal
Lambda repressor (CI2a) shares 100% identity at its
carboxy terminus with the CI2c variants, but its amino
terminal end is unique, which again implies that it binds
DNA differently from the CI2c variants. The Stx2 converting
phage I possesses the cly variant (Additional file
1: Table S1) not previously included in the Stx phage
multilocus PCR typing system [21]. Orthologues of the
cro gene product (Cros) are carried by Stx phages
933W, Stx2 converting phage 86, Φ24B and Min27 and
are all identical at the aa level. The cro gene variant (cro4)
is carried by Stx2 converting phages 1717 and II as well as
VT2-Sa, again sharing 100% amino acid identity. Lambda
phage encodes Cro1; BP-4795, Cro9; YYZ-2008, Cro10; Stx2
converting phage I, Cro11 and P27, Cro12. All the diversity
seen across the cI variants and the lack of association of
specific cI genes with specific cro genes (Figure 3B) has
been predicted [58], providing evidence of repressor/ 
operator coevolution. This coevolution has been pre- 
dicted to drive superinfection immunity groups and 
thus effect the production of new and novel Stx phage 
mosaics [5]. Only the CII from Min27 is completely 
identical to that of Φ24B; all the other phages in the
Circos comparison, apart from P27 and Lambda, have 
CII proteins that are approximately 86% identical at the 
protein level. Lambda CII has the lowest sequence iden- 
tity at 36% and no orthologue was identified in P27. 

Only Stx phage Min27 carries O and P genes (O2/P2;
Additional file 1: Table S1) like those carried by Φ24B
(99 and 98% identity, respectively). Across all of the
phages, there were five distinct DNA replication systems
encoded, with little homology shared between systems.
O1/P1 is carried by Lambda phage, 933W, Stx2
converting phage I and BP-4795; O3/P3 is carried by
Stx2 converting phage II, VT2-Sa and Stx2 converting
phage 86; O4/P4 is carried by P27; and O5/P5 is carried
by Stx2 converting phage 1717 (Figure 3B). These two-protein
systems would therefore be a suitable additional
diversity marker for phage characterisation (Additional
file 1: Table S1). The lytic induction enhancer, Ant [55],
can also be identified in genomic context within the gen- 
omes of Min27 (97%), VT2Sa and Stx2 converting phage 
II (78%) and Stx2 converting phage 1717 (73%) 
(Figure 3A). Downstream of Ant is a gene encoding a
protein of similar predicted conformation, Roi, which
shares its 125 amino-terminal amino acid sequence (242
aa in total) with Roi from bacteriophage HK022 [59]. In
bacteriophage HK022, Roi has been implicated in phage
lytic growth [59]. RoiΦ24B is identical at the protein level
to RoiMin27, and possesses 99% sequence identity to the
Roi genes of five of the other Stx phages. The Roi proteins
encoded by genes carried by Stx2 converting phage II and VT2-Sa,
and by Stx2 converting bacteriophage 86, are still distinctly
similar but share lower identity to RoiΦ24B (89 and 83%,
respectively); in all cases the genomic context of Roi in
these Stx phages is preserved. The protein product of 
the antiterminator gene Q is widely conserved (>98% 
identity) throughout the Stx phages, as it is in all lambdoid
phages [60]. The well conserved short tail of Φ24B is
widespread across Stx phages [21] due to its outer membrane
protein adsorption target, which is itself highly conserved
and the product of an essential gene in the bacterial host [20].
Examination of the distribution and similarity of the
gene encoding this short tail structure across the
sequenced Stx phages, 933W, VT2-Sa, Min27, Stx2 converting
phage II and Φ24B reveals 99% sequence identity
at the protein level. This 1% difference is simply due to
different start codons. Stx2 converting phage I possesses
a tail gene with 95% identity to Φ24B.

A Jaccard dissimilarity dendrogram (Figure 3B) was 
created from data on the presence or absence of the 
gene variants associated with each sequenced genome. 
The dendrogram illustrates the high level of genetic diversity
that exists amongst these 11 Stx phages, with no
two phages possessing an identical genetic profile. This
further demonstrates the genetic heterogeneity of Stx
phages previously revealed by PCR multilocus typing of
phage pools induced from STEC strains (55).
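
For readers unfamiliar with the measure, the Jaccard dissimilarity between two presence/absence profiles is one minus the ratio of shared variants to all variants observed in either profile. The Python sketch below illustrates only this distance step; the profile names and variant labels are hypothetical placeholders and do not reproduce the data in Figure 3B, and the clustering used to draw the dendrogram is not shown.

```python
from itertools import combinations


def jaccard_dissimilarity(profile_a, profile_b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two sets of gene-variant labels."""
    union = profile_a | profile_b
    if not union:
        # Two empty profiles are treated as identical.
        return 0.0
    return 1.0 - len(profile_a & profile_b) / len(union)


# Hypothetical profiles: each set lists the locus variants detected in a genome.
profiles = {
    "phage_A": {"locus1_v1", "locus2_v1", "locus3_v1"},
    "phage_B": {"locus1_v1", "locus2_v1", "locus3_v2"},
    "phage_C": {"locus1_v2", "locus2_v2", "locus3_v3"},
}

# Pairwise dissimilarities, keyed by alphabetically ordered name pairs.
distances = {
    (first, second): jaccard_dissimilarity(profiles[first], profiles[second])
    for first, second in combinations(sorted(profiles), 2)
}

for pair, value in distances.items():
    print(pair, round(value, 2))
```

A value of 0 means two genomes carry exactly the same set of variants, and a value of 1 means they share none, so the observation that no two phages have an identical profile corresponds to every pairwise value being greater than 0.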

The most challenging question in phage genomics is: 
What is the function of the uncharacterised genes that 
dominate bacteriophage genomes? Phage genomes are 
normally small and compact, and it is likely that many 
of the genes of unknown function have been main- 
tained in this dynamic pool by positive selection pressure. 
Most Stx phages have larger genomes than bacteriophage 
lambda, so carry more genes that are not required for 
core lambdoid phage replication and life cycle control. 
The suggestion that these accessory genes have roles in
the fitness of either the Stx phages themselves or their
bacterial hosts can be made with some confidence.

Conclusions 

Over the last 10 years, the phage research community 
has begun to use genomic analyses to compare double 
stranded DNA phages, most extensively with respect to 
the comparative genomics of mycobacteriophages or 
their lysogens [61-69]. Bacteriophages are significant dri- 
vers of bacterial evolution because of their ability to dis- 
seminate DNA across their host range, either as 
converting phages [70] or through both generalised (59) 
and specialised (25) transduction. By identifying genetic
variation in groups of phage which impact upon the 
phenotypic profiles of their hosts, it may be possible to 
infer biological roles for the numerous hypothetical 
proteins identified in translated bacteriophage genome 
sequences. 

In this full genomic comparison of eleven Stx phages 
we have demonstrated that no two sequenced Stx phages
are identical. All of the lambdoid phages are mosaics, 
sharing genomic loci and genomic synteny, but to vary- 
ing degrees. The short-tailed Stx phages possess more 
genomic relatedness, which may be driven by their 
shared host range (due to the adsorption target, BamA) 
enabling appreciable levels of genomic recombination, 
facilitating efficient recombination of and selection for 
genetic material carried by these phages. The phage 
backbone of P27 is very different from the other Stx 
phages and may be the result of a productive recombination
event between a non-lambdoid and a lambdoid
phage, as many key regulatory lambdoid phage elements 
cannot be identified within the P27 genome. However, 
the Shiga toxin genes remain linked to the Q gene. It 
has been reported before that lambdoid phages appear 
to possess most genetic morons within the late gene re- 
gion [27], and the Stx phages hold true to this observa- 
tion. The conserved nature of many of these morons, 
which are likely to confer some as yet unidentified property
to their host cell, indicates that Stx phages are likely
to contribute more to their pathogenic bacterial host 
than toxin production. Understanding these factors is 
likely to be important to understanding the evolution of 
EHEC and other Shiga toxin producing enteric pathogens. 

Genomic approaches to phage biology provide the 
means to examine the growing number of novel bacterio- 
phages isolated directly from different environments, 
induced from their bacterial hosts or identified as pro- 
phages in sequenced bacterial genomes. Deep pyrosequen- 
cing technologies, enabling metaviral analyses of 
environmental samples, are further driving our under- 
standing and appreciation of bacteriophage genomics and 
the bacteriophage pan-genome [71,72]. Assigning defini- 
tive or putative functions to the hypothetical proteins that 
are the expressed products of the majority of bacteriophage
genes remains the main barrier to significant progress
in unravelling bacteriophage biology.

Methods 

Bacterial strains and bacteriophages 

The E. coli C derivative strain WG5 and the E. coli K12
strain DM1187 have been used to isolate and propagate a
number of Stx phages previously [15,16,18,21,22,73]. Un- 
less stated otherwise, these bacterial strains were grown in 
Luria-Bertani broth (VWR) or on plates prepared by 
addition of 1.5% (w/v) agar (Difco). The engineered variant
of Φ24B sequenced in this study, Φ24B::Cat [22],
possesses a stx operon that has been replaced with the cat 
gene, which confers chloramphenicol resistance upon its 
lysogen. 

vB_EcoP-24B::Cat (Φ24B::Cat) DNA extraction for genome
sequencing 

Agar plates with semi-confluent plaques of Φ24B::Cat
were flooded with 3 ml of SM buffer (50 mM Tris-Cl
[pH 7.5], 0.1 M NaCl, 10 mM MgSO4) [74] and gently
agitated overnight at 4°C. The SM buffer was harvested
and the plate flooded again with SM buffer. The top agar
containing the plaques and the second volume of SM
buffer were then scraped from the agar plates and added
to the former sample. This mixture was vortexed, and
the top agar and bacterial debris pelleted by centrifugation
(10,000 g, 10 min). Chloroform (30 µl 10 mL⁻¹) was
added to the recovered supernatant to inactivate any
remaining bacterial cells. Contaminating bacterial DNA
and RNA were removed by the addition of DNAse
(Ambion; 5 µg mL⁻¹) and RNAse (1 µg mL⁻¹), and the
mixtures were incubated at 37°C for 1 hr. The phages
present were precipitated in the presence of 33% PEG
8000 (Sigma) on ice for 30 min and recovered by centrifugation
at 10,000 g for 10 min. The resulting phage pellet
was suspended in 500 µl of SM per 30 ml original vol
followed by a further DNAse and RNAse digestion. The
viral nucleic acid was purified following two extractions
with an equal vol 25:24:1 phenol:chloroform:isoamyl alcohol
and centrifugation (14,500 g, 30 min). The DNA
present was precipitated by the addition of 0.6 vol isopropanol.
The DNA was harvested by centrifugation
(14,500 g for 30 min), washed with 70% ethanol and
allowed to air dry. It was then suspended in 100 µl of
distilled H2O [60].

Φ24B::Cat sequencing and annotation

The Φ24B::Cat phage genome was sequenced at the
Wellcome Trust Sanger Institute. The phage DNA was
randomly sheared by sonication and a library produced
by cloning fragments into the plasmid pUC19 (New
England Biolabs). The phage genome was sequenced to
provide 10x coverage using the ABI3730 sequencer (Applied
Biosystems). Assembly of the sequence was accomplished
using Phrap, and contiguous sequence was
assembled using GAP4. The predicted phage coding
genes were identified using ORPHEUS and GLIMMER,
and these predictions were combined and
annotated in Artemis [75] by comparison against the
non-redundant database using BLASTN and TBLASTX
[76]. Putative coding sequences were added to the annotation
if they contained both start and stop codons and a
probable ribosome binding site.

Genome comparison 

The accession numbers for the Stx phages used for the 
genome comparison were: Φ24B::Cat (HM208303), 933W
(AF125520), P27 (AJ298298), Min27 (EU311208), Stx2
converting phage I (AP004402), Stx2 converting phage II
(AP005154), Stx2 converting phage 86 (AB255436), Stx2
converting phage 1717 (FJ188381), VT2-Sakai
(AP000363), YYZ-2008 (FJ184280), BP-4795 (AJ556162)
and non-Stx encoding bacteriophage Lambda (J02459).
Comparative genome analysis was performed using
MUMmer version 3 [77] and visualized using Circos
[78]. Coordinates were generated using NUCmer [77] with
the parameters breaklen, maxgap, mincluster, and minmatch
set to 200, 90, 65 and 20, respectively.
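
As a hedged illustration of how the stated NUCmer parameters might be passed on the command line, the Python sketch below assembles an argument list for a single pairwise alignment. The FASTA file names and output prefix are hypothetical, the helper function is introduced here for illustration, and the exact invocation used in the study is not given in the text. The sketch only builds and prints the command; it does not run MUMmer.

```python
import shlex

# Parameter values as stated in the Methods text.
NUCMER_PARAMETERS = {
    "breaklen": 200,
    "maxgap": 90,
    "mincluster": 65,
    "minmatch": 20,
}


def nucmer_command(reference_fasta, query_fasta, prefix):
    """Build a nucmer argument list (hypothetical helper, not the authors' script)."""
    command = ["nucmer"]
    for name, value in NUCMER_PARAMETERS.items():
        command += [f"--{name}", str(value)]
    # --prefix sets the base name of the output delta file.
    command += ["--prefix", prefix, reference_fasta, query_fasta]
    return command


# Hypothetical file names, used only to show the shape of the command.
example = nucmer_command("phi24B.fasta", "other_phage.fasta", "phi24B_vs_other")
print(shlex.join(example))
```

One such alignment per comparison genome, each against the Φ24B reference, would yield the coordinate sets that a Circos plot of the kind shown in Figure 3A requires.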

R-based loci comparisons 

The presence of bands from each individual amplifica- 
tion reaction, using primer pairs specific for variant loci 
[21], was used as the data for construction of a binary 
similarity matrix. A computation script was written in
R version 2.11.1 to enable visualisation of the variant of
each genetic locus present.
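
The original script was written in R and is not reproduced here. As a rough illustration of the encoding step only, the Python sketch below converts band presence calls into a binary presence/absence table. The locus names, phage names and band calls are hypothetical, and the function is an assumption introduced for this example.

```python
def presence_absence_table(band_calls, loci):
    """Encode band calls as rows of 1 (band present) and 0 (band absent).

    band_calls maps each phage name to the set of loci that gave a band;
    loci fixes the column order of the resulting table.
    """
    return {
        phage: [1 if locus in observed else 0 for locus in loci]
        for phage, observed in band_calls.items()
    }


# Hypothetical loci and band calls, for illustration only.
loci = ["locus1_v1", "locus1_v2", "locus2_v1", "locus2_v2"]
band_calls = {
    "phage_A": {"locus1_v1", "locus2_v1"},
    "phage_B": {"locus1_v2", "locus2_v1"},
}

table = presence_absence_table(band_calls, loci)

print("phage", *loci, sep="\t")
for phage, row in table.items():
    print(phage, *row, sep="\t")
```

Each row of such a table is the binary profile of one phage, which is the form of input that a presence/absence dissimilarity measure operates on.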

Additional file 



Additional file 1: Table S1. Suggested primer set additions for Stx
phage characterisation.